Supervised Word Sense Disambiguation using Python

Author

  • Phil Katz
Abstract

In this paper, we discuss the problem of Word Sense Disambiguation (WSD) and one approach to solving the lexical sample problem. We use training and test data from SENSEVAL-3 and implement methods based on Naïve Bayes calculations, cosine comparison of word-frequency vectors, decision lists, and Latent Semantic Analysis. We also implement a simple classifier combination system that combines these classifiers into one WSD module. We then prove the effectiveness of our WSD module by participating in the Multilingual Chinese-English Lexical Sample Task from SemEval-2007.
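
To make one of the methods named in the abstract concrete, here is a minimal Python sketch of cosine comparison of word-frequency vectors for a lexical-sample task. It is not the paper's implementation: the function names (`build_sense_vectors`, `cosine`, `disambiguate`), the whitespace tokenization, the sense labels, and the toy training sentences are all assumptions made for illustration, and no SENSEVAL-3 data is used.

```python
import math
from collections import Counter, defaultdict


def build_sense_vectors(training_examples):
    """Sum the context-word counts of all training examples, per sense."""
    sense_vectors = defaultdict(Counter)
    for context, sense in training_examples:
        # Assumption: lowercase whitespace tokens stand in for real preprocessing.
        sense_vectors[sense].update(context.lower().split())
    return sense_vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(context, sense_vectors):
    """Return the sense whose aggregate vector is closest to the context."""
    query = Counter(context.lower().split())
    return max(sense_vectors, key=lambda s: cosine(query, sense_vectors[s]))


# Hypothetical toy data for the target word "bank" (not from any corpus).
TOY_TRAINING = [
    ("he deposited money at the bank", "bank/finance"),
    ("the bank approved the loan", "bank/finance"),
    ("they fished from the river bank", "bank/river"),
    ("the river bank was muddy", "bank/river"),
]

if __name__ == "__main__":
    vectors = build_sense_vectors(TOY_TRAINING)
    print(disambiguate("she took a loan from the bank", vectors))
```

A real system would likely add stop-word filtering or term weighting, and the combined module described in the abstract would merge this classifier's decision with those of the other methods; neither step is shown here.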

Similar resources

Semi-Supervised Word Sense Disambiguation Using Word Embeddings in General and Specific Domains

One of the weaknesses of current supervised word sense disambiguation (WSD) systems is that they only treat a word as a discrete entity. However, a continuous-space representation of words (word embeddings) can provide valuable information and thus improve generalization accuracy. Since word embeddings are typically obtained from unlabeled data using unsupervised methods, this method can be see...

Review: Semi-Supervised Learning Methods for Word Sense Disambiguation

Word sense disambiguation (WSD) is an open problem in natural language processing: identifying the appropriate sense of a word in a sentence when the word has multiple meanings. Many approaches have been proposed to solve the problem, of which supervised learning approaches are the most successful. However, supervised machine learning approaches are limited by the difficulties...

Semi-supervised Clustering for Word Instances and Its Effect on Word Sense Disambiguation

We propose a supervised word sense disambiguation (WSD) system that uses features obtained from clustering results of word instances. Our approach is novel in that we employ semi-supervised clustering that controls the fluctuation of the centroid of a cluster, and we select seed instances by considering the frequency distribution of word senses and exclude outliers when we introduce “must-link”...

Learning a Robust Word Sense Disambiguation Model using Hypernyms in Definition Sentences

This paper proposes a method to improve the robustness of a word sense disambiguation (WSD) system for Japanese. Two WSD classifiers are trained from a word sense-tagged corpus: one is a classifier obtained by supervised learning, the other is a classifier using hypernyms extracted from definition sentences in a dictionary. The former will be suitable for the disambiguation of high frequency wo...

Four Methods for Supervised Word Sense Disambiguation

Word sense disambiguation, the task of identifying the intended meaning of an ambiguous word in a given context, is one of the central problems in natural language processing. This paper describes four novel supervised disambiguation methods which adapt some familiar algorithms. They build on the Vector Space Model using an automatically generated stop list and two different statistical methods o...

Publication date: 2007